ABSTRACT


Dept. of the Navy, Washington, D.C.

Monte Carlo studies of the original version of the half-normal plot (Daniel,
Technometrics 1 (1959) 311-341) and two new versions are reported. Data representative
of the 15 contrasts from a 2^{p-q}, p-q = 4, factorial experiment are generated. Design
parameters are α, the experimentwise error rate, r, the number of real contrasts
present in the 2^{p-q} experiment, and m, the size of the real contrasts present. Studies
are made for designs with α = .05, .20, .40, r = 0, 1, 2, 4, and 6, m = 0(2σ)8σ, where
σ is the standard deviation of a contrast.

We  give  critical  values  for  the  various  versions  which  control  the  experimentwise 
error  rate.  These  critical  values  are  considerably  different  than  those  given  by 
Daniel. 

Detection rate, i.e., the proportion of real contrasts declared significant, false
positive behavior, and estimation of σ are examined. The Monte Carlo studies indicate
that one of the new versions is superior to the original version. The detection rate
of all versions decreases drastically when r increases from one to two to four. When
several small real contrasts are present, the sensitivity can be increased and the
magnitude of the average errors in estimating σ can be greatly reduced by using
α = .2 or .4, rather than α = .05.

Nomination procedures for analyzing single replicate 2^4 factorial experiments
have a smaller detection rate than the half-normal plot with an equivalent
experimentwise error rate, unless the experimenter can accurately nominate ten
error contrasts in the 2^4 experiment.
AN EMPIRICAL STUDY OF THE HALF-NORMAL PLOT


by 

Douglas  A.  Zahn 
Florida  State  University 


1.  INTRODUCTION 

The half-normal plot as introduced by Daniel (1959) is a multipurpose
tool for criticizing and interpreting the contrasts yielded by
a  single  replication  factorial  experiment.  It  gives  us  a  means  for 
approaching  the  following  general  problem: 

Let the statistics Y_1, Y_2, ..., Y_n be independent,
normally distributed random variables with means μ_1, μ_2,
..., μ_n and common variance σ^2. Assume that at most r of the
μ_i are non-zero, where r is small relative to n, and that we
have no prior estimate of σ. Given y_1, y_2, ..., y_n (single
observations of each of the n random variables), (1) decide
which, if any, of the μ_i are non-zero and (2) estimate σ.

This problem arises in many contexts. The motivating one was
single replication factorial experiments where we can think of y_i as
a contrast estimating the effect μ_i of a factor or interaction between
factors in the experiment. The half-normal plot can be used to decide
which of the μ_i are significantly non-zero. After the significant
contrasts have been isolated, we can then use the half-normal plot to
estimate σ from the insignificant contrasts.

Birnbaum (1959) investigated the probability that the half-normal
plot will detect a single real contrast, i.e., a contrast with mean
μ ≠ 0, when n = 31. He found that it compared well to the multiple
t-test in this situation. However, he warned that his findings rested
heavily on the assumption that only one real contrast was present. He
also warned that if more than one real contrast was present, the power
of the plot may be greatly reduced. This report presents results of
a Monte Carlo study of Daniel's original version of the half-normal
plot and two modifications of it when as many as six real contrasts of
size 1σ to 8σ are present in a 2^{p-q}, p-q = 4, factorial experiment.
Thus, we consider the power, false positive behavior, and variance
estimation of these versions of the half-normal plot when they are
applied to the general problem in the case n = 15, since there are 15
contrasts of interest, ignoring the grand mean, in a 2^4 factorial
experiment. However, these versions, or obvious modifications of them,
can be used for any n. The cases n = 8, 9, ..., 15 can be analyzed
using critical values in Table 3.1. Critical values for other cases
can be constructed using procedures described in Section 3. Additional
results are presented in Zahn (1969) and are available from the author
on request.

2.  VERSIONS  OF  THE  HALF-NORMAL  PLOT 

To investigate the effects of modifying various steps of the
half-normal plot on sensitivity and variance estimates, we have
included a comparison of several versions of the half-normal plot in
our Monte Carlo investigation.


2.1 Version X

Calculate the order statistics, x_1 ≤ x_2 ≤ ... ≤ x_{15}, of the
absolute values of the 15 observed contrasts. Divide the order
statistics by s_i, the initial estimate of σ. In this version
s_i = x_{11}. This produces a set of standardized order statistics,
t_1 ≤ t_2 ≤ ... ≤ t_{15}, which are also used as test statistics to
examine the order statistics for significance.

This set of scale-free order statistics may be plotted on a
revised standardized half-normal grid, illustrated in Figure 2.1.

The .05, .20, and .40 level critical values for this version are
indicated by the points on the grid which are connected by lines to
form the .05, .20, and .40 level guardrails, respectively. This grid
differs from the standardized half-normal grid given in Daniel (1959)
in three ways:

(1) The standardized order statistics are plotted as the
ordinate, rather than as the abscissa, as in Daniel
(1959), to make the half-normal plot correspond more
closely to the usual regression graph on which the
random variable is plotted as the ordinate.

(2) To take advantage of benefits cited by Ferguson (1960),
the standardized order statistics are plotted against
the mean of the order statistics of a random sample
of size 15 from the standard half-normal distribution.
These means have been computed by Blankenship (1965)
and are given in Table 2.1 for samples of sizes 1(1)15.

(3) The guardrails given on the revised grid differ
considerably from those on the grids in Daniel (1959).
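The plotting positions in item (2) can be approximated numerically. The following is a minimal sketch, not Blankenship's (1965) method: it assumes the standard half-normal distribution function F(x) = erf(x/√2) and integrates the survival function of the i-th order statistic with the trapezoid rule. The function name, the integration limit, and the step count are illustrative choices.

```python
import math

def order_stat_mean(i, n, upper=10.0, steps=4000):
    """Approximate m(i, n): the mean of the i-th smallest value in a random
    sample of size n from the standard half-normal distribution.

    Uses E[X] = integral of P(X > x) over x >= 0, with the trapezoid rule
    on [0, upper]; the tail beyond `upper` is negligible for upper = 10.
    """
    def survival(x):
        f = math.erf(x / math.sqrt(2.0))  # half-normal CDF at x
        # P(i-th order statistic <= x) = P(at least i of n values <= x)
        cdf = sum(math.comb(n, k) * f**k * (1.0 - f)**(n - k)
                  for k in range(i, n + 1))
        return 1.0 - cdf

    h = upper / steps
    inner = sum(survival(j * h) for j in range(1, steps))
    return h * (0.5 * (survival(0.0) + survival(upper)) + inner)

# Plotting positions for the 15 contrasts of a 2^4 experiment.
positions_15 = [order_stat_mean(i, 15) for i in range(1, 16)]
```

The values for small samples can be compared with the entries of Table 2.1, for example m(1,2) and m(3,3).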
FIGURE 2.1
TABLE 2.1

Means of the Order Statistics of Random Samples of Sizes
1, 2, ..., 7 From the Standard Half-Normal Distribution

The table entry in row i and column j is m(i,j) = the
mean of the i-th order statistic in a random sample of
size j from the standard half-normal distribution.

 i    j = 1      2      3      4      5      6      7

 1     ...     0.467  0.335  0.262  0.216  0.183  0.160
 2             1.128  0.732  0.553  0.44?  0.377  0.326
 3                    1.326  0.911  0.712  0.589  0.504
 4                           1.465  1.044  0.835  0.702
 5                                  1.570  1.149  0.934
 6                                         1.654  1.235
 7                                                 ...


TABLE 2.1, continued

Means of the Order Statistics of Random Samples of Sizes
8, 9, ..., 15 From the Standard Half-Normal Distribution

The table entry in row i and column j is m(i,j) = the mean
of the i-th order statistic in a random sample of size j
from the standard half-normal distribution.


10  11  12  13  14  15
The computation of our guardrails and the error rate
they are intended to control are considered in Section 3.

Table 3.1 gives the critical values on which the guardrails for
this version and the subsequent two versions are based.

The detection process is a sequential statistical procedure in
that whether contrasts are tested for significance late in the
process depends on results observed early in the process. We first
test x_{15} for significance by comparing test statistic t_{15} with the
appropriate critical value c_{15}. If t_{15} > c_{15}, x_{15}
is declared significant and we then examine x_{14}. We continue testing
contrasts until one is declared insignificant or until a maximum of
four contrasts are declared significant.
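The sequential test just described can be sketched in a few lines. This is an illustration under stated assumptions rather than the author's program: the function name and arguments are hypothetical, the denominator is taken to be the 11th order statistic, and the example critical values are the .05 level values for this version as read from Table 3.1 and from the worked n = 13 example.

```python
def detect_version_x(contrasts, critical_values, denom_rank=11, max_significant=4):
    """Return the number of contrasts declared significant by a Version X
    style sequential test.

    contrasts        -- the observed contrasts (any sign)
    critical_values  -- critical values for the largest, second largest, ...
                        standardized order statistics, in that order
    denom_rank       -- rank of the order statistic used as s_i (assumed 11)
    """
    x = sorted(abs(c) for c in contrasts)
    s_i = x[denom_rank - 1]            # initial estimate of sigma
    n_significant = 0
    for step, crit in enumerate(critical_values[:max_significant]):
        t = x[len(x) - 1 - step] / s_i  # t_15, then t_14, ...
        if t > crit:
            n_significant += 1
        else:
            break                       # stop at the first insignificant contrast
    return n_significant

# .05 level critical values for t_15, t_14, t_13, t_12 (as read from the source).
CRIT_X_05 = [3.14, 2.83, 2.36, 1.94]
```

For example, a call such as `detect_version_x(observed, CRIT_X_05)` would return a count between 0 and 4 for a list `observed` of 15 contrasts.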

If desired, the detection process can be carried out without
graphing the order statistics. We need merely compute the appropriate
test statistics and compare them to the critical values. Of course,
the beauty of the plot is that it also enables us to examine the
contrasts for the abnormalities discussed by Daniel (1959).

The detection process divides the contrasts into two sets:
significant contrasts and insignificant contrasts. The latter set
will be referred to as error contrasts because we calculate s_f, our
final estimate of σ, from them. Let e denote the number of error
contrasts. We plot on ordinary linear-by-linear graph paper
x_1, x_2, ..., x_e against m(i,e), i = 1, 2, ..., e, where m(i,e)
denotes the mean of the i-th order statistic in a random sample of size
e from the standard half-normal distribution. We fit the least squares
regression line through the origin of x on m. The slope of this line
is s_f, the final estimate of σ, which should not be confused with s_i,
the initial estimate of σ.

The cases n = 14, 13, and 12 can also be examined using this
version with x_{11} as the test statistic denominator and the critical
values in Table 3.1, provided that the researcher is willing to assume
that no more than three, two, or one real contrasts are present,
respectively. For instance, if n = 13, the first test statistic is
t_{13} = x_{13}/x_{11} and its .05 level critical value is 2.36; the second
is t_{12} = x_{12}/x_{11} and its .05 level critical value is 1.94.


2.2 Version S


Versions S and X differ only with respect to the test statistic
denominators used during the detection process. We define

SL(k,j) = \sum_{i=1}^{k} x_i m(i,j) / \sum_{i=1}^{k} [m(i,j)]^2,   k ≤ j.

The statistic SL(11,15) is used as test statistic denominator by
version S. This is the slope of the least squares regression line
through the origin of x on m fitted to the points [m(i,15), x_i],
i = 1, 2, ..., 11. For this version, the standardized order statistics
are t_{15} = x_{15}/SL(11,15), etc. Thus, the test statistic denominators
of version S should be less variable estimators of σ than the test
statistic denominators of X, since the denominators of S are based
on more information. Hence, the guardrails on the revised standardized
half-normal grid for version S illustrated in Figure 2.2 differ from
those on the grid for version X. Again, four is the maximum number
of contrasts which may be declared significant. The final estimate
of σ is obtained exactly as it is by version X. Using the above
notation we see that, if there are e error contrasts, s_f = SL(e,e).


FIGURE 2.2

Revised Standardized Half-Normal Grid for
Version S for 15 Contrasts
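The slope statistic SL(k,j) is simple to compute once the means m(i,j) are available. The sketch below assumes the caller supplies those means as a list (for instance from Table 2.1); the function name `sl` and its argument order are illustrative.

```python
def sl(x_sorted, means, k):
    """Slope of the least squares line through the origin fitted to the
    points (m(i, j), x_i), i = 1, ..., k.

    x_sorted -- ordered absolute contrasts x_1 <= x_2 <= ...
    means    -- m(1, j), m(2, j), ... for the sample size j in use
    k        -- number of smallest contrasts used in the fit (k <= j)
    """
    if not 1 <= k <= min(len(x_sorted), len(means)):
        raise ValueError("k must be between 1 and the number of points")
    numerator = sum(x_sorted[i] * means[i] for i in range(k))
    denominator = sum(means[i] ** 2 for i in range(k))
    return numerator / denominator

# Means for a sample of size 3, taken from Table 2.1: m(1,3), m(2,3), m(3,3).
MEANS_3 = [0.335, 0.732, 1.326]

# With e = 3 error contrasts the final estimate is s_f = SL(3, 3).
s_f = sl([0.5, 1.0, 2.0], MEANS_3, 3)
```

When the contrasts are exactly proportional to the means, the slope recovers the constant of proportionality, which is the sense in which SL estimates σ.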

2.3 Version R

Only one of three other half-normal plot versions investigated
in our Monte Carlo study will be considered here. The
first step in version R is to compute the standardized order
statistics t_i = x_i/SL(7,15), i = 1, 2, ..., 15. Here the test
statistic denominator is the slope of the least squares line through
the origin fitted to the smallest half of the contrasts being tested for
significance. If x_{15} is declared significant, the remainder of
the detection process in this version differs considerably from the
detection process in versions X and S.

We reassess our position every time a contrast is declared
significant. This step reflects the fact that a contrast appearing
significantly large alters our assessment of the state of nature.
If x_{15} is declared significant, we then examine the remaining 14
absolute contrasts under the hypothesis that they constitute a
random sample of size 14 from the half-normal distribution. These
14 order statistics are restandardized by dividing by SL(7,14), which
is an unbiased estimate of σ under this hypothesis. We now compare
the restandardized value of x_{14} to the appropriate critical value.
If it is declared significant, we concentrate on the remaining 13
absolute contrasts, considering them as a random sample of size 13
from the half-normal distribution. We restandardize them by dividing
by SL(6,13). If x_{14} is not declared significant, s_f is computed as
before from the 14 error contrasts.
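The restandardization loop of version R can be sketched as follows. Several points are assumptions made only for illustration: the denominators SL(7,15), SL(7,14), and SL(6,13) are generalized to "the smallest floor(n/2) of the n remaining contrasts", the means m(i,n) are obtained by numerical integration rather than from Table 2.1, and all names are hypothetical. Critical values are passed in by the caller.

```python
import math

def order_stat_mean(i, n, upper=10.0, steps=2000):
    """Approximate mean of the i-th smallest of n standard half-normal variates."""
    def survival(x):
        f = math.erf(x / math.sqrt(2.0))
        return 1.0 - sum(math.comb(n, k) * f**k * (1.0 - f)**(n - k)
                         for k in range(i, n + 1))
    h = upper / steps
    inner = sum(survival(j * h) for j in range(1, steps))
    return h * (0.5 * (survival(0.0) + survival(upper)) + inner)

def sl(x_sorted, k):
    """SL(k, j) with j = len(x_sorted): slope through the origin of x on m."""
    j = len(x_sorted)
    means = [order_stat_mean(i, j) for i in range(1, k + 1)]
    return sum(x * m for x, m in zip(x_sorted, means)) / sum(m * m for m in means)

def detect_version_r(contrasts, critical_values, max_significant=7):
    """Return (number declared significant, final estimate s_f)."""
    x = sorted(abs(c) for c in contrasts)
    n_significant = 0
    while n_significant < min(max_significant, len(critical_values)):
        k = len(x) // 2                 # assumed rule: 15 -> 7, 14 -> 7, 13 -> 6
        t = x[-1] / sl(x, k)            # restandardized largest remaining contrast
        if t <= critical_values[n_significant]:
            break
        x.pop()                         # remove the significant contrast
        n_significant += 1
    s_f = sl(x, len(x))                 # SL(e, e) from the e error contrasts
    return n_significant, s_f
```

Removing a significant contrast before recomputing the denominator is what distinguishes this loop from versions X and S, where a single denominator is used throughout.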
Another difference between this version and the previous two
is that R may declare as many as seven contrasts significant. To
gain this additional flexibility, we have based the test statistic
denominators on at most the seven smallest absolute contrasts.

3. CRITICAL VALUES FOR THE HALF-NORMAL PLOT VERSIONS

Before any version can be used, we need critical values for
judging the test statistics. Clearly, each version requires several
critical values since each may declare more than one contrast
significant. The α level critical value for a given test statistic
t is defined to be the (1-α) quantile of its distribution under the
hypothesis H: all contrasts currently being tested for significance
have means μ = 0. For example, for t_{13} = x_{13}/x_{11} for version X,
the .05 level critical value is the .95 quantile of its distribution
under the hypothesis H: the 13 contrasts being tested for significance
all have means μ = 0. Table 3.1 gives .05, .20, and .40 level
critical values for all versions. We determined these critical
values by generating an empirical distribution based on 999 simulations
for each test statistic and estimating the .95, .80, and .60 quantiles
by the corresponding quantiles of the empirical distribution.
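The simulation of a critical value can be illustrated for the first test statistic of version X. This is a hedged sketch, not the original Fortran program: it uses Python's `random` module in place of the generator used in the study, assumes the denominator is the 11th order statistic, and takes the empirical quantile to be the order statistic of rank (1-α)(N+1), which is an integer when N = 999.

```python
import random

def simulate_critical_value(alpha, n=15, denom_rank=11, n_sims=999, seed=0):
    """Estimate the alpha level critical value of t_n = x_n / x_denom_rank
    under the hypothesis that all n contrasts have mean zero."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_sims):
        # Ordered absolute values of n null contrasts (sigma = 1).
        x = sorted(abs(rng.gauss(0.0, 1.0)) for _ in range(n))
        stats.append(x[-1] / x[denom_rank - 1])
    stats.sort()
    # Rank (1 - alpha)(N + 1) in the ordered sample, clamped to a valid index.
    rank = round((1.0 - alpha) * (n_sims + 1))
    index = min(n_sims - 1, max(0, rank - 1))
    return stats[index]

# Estimated .05, .20, and .40 level critical values for t_15.
estimates = {a: simulate_critical_value(a) for a in (0.05, 0.20, 0.40)}
```

Critical values for later test statistics, or for a different denominator, would follow the same pattern with the number of contrasts under test reduced accordingly.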

The precision of our estimated quantiles can be evaluated using
methods described by Wilks (1962, p. 331). We are 90% confident
that the critical values given in Table 3.1 are correct to within
± .10, except for the .05 level critical values for version R. We
are 90% confident that these values are correct to within ± .25.
TABLE 3.1

.05, .20, and .40 Level Critical Values of the Test Statistics Used By
All Versions.

These α level critical values are the corresponding (1-α) quantiles of
the simulated empirical distributions.

                              Test Statistic
Version  Level α    t_15   t_14   t_13   t_12   t_11   t_10   t_9

   R       .05      4.13   4.11   4.31   4.17   4.43   4.15   4.69
           .20      2.94   2.89   2.96   2.02   2.94   2.84   2.90
           .40      2.32   2.26   2.27   2.22   2.22   2.15   2.13

   X       .05      3.14   2.83   2.36   1.94
           .20      2.48   2.21   1.91   1.54
           .40      2.08   1.87   1.60   1.31

   S       .05      3.37   3.00   2.61   2.21
           .20      2.61   2.34   2.06   1.76
           .40      2.20   1.97   1.76   1.51


EER if no real contrasts are present. In this situation we declare
at least one false positive if and only if we declare the largest
contrast significant. This occurs if t_n, the test statistic for
examining x_n, is larger than c_n, its α level critical value. But
c_n is the (1-α) quantile of t_n under the hypothesis that all n
contrasts being examined have means μ = 0. Hence, EER = P(t_n > c_n) = α.

Using the (1-α) quantiles as α level critical values in general
situations, we have proved the following theorem for n = 2 and n = 3.

Theorem: The experimentwise error rate (EER) of
the half-normal plot using α level critical values is
α, regardless of how many real contrasts of various
sizes are present in the experiment being analyzed.

The proofs for these cases and much empirical support for the case
n = 15 are presented in Zahn (1969).

3.1. Difference Between the Empirically Determined Critical Values
and Daniel's Critical Values


Since  version  X  and  the  original  version  of  the  half-normal  plot 
use  identical  test  statistics,  critical  values  corresponding  to  those 


guardrails of Figure 11a of Daniel (1959). When no real contrasts are
present, Daniel's guardrails and the simulated guardrails will yield
approximately the same EER. However, if one large contrast is present,
Daniel's .05 guardrail may yield an EER as large as .20. Daniel refers
to his .05 guardrail as having a "rejection rate" of .05. The large
difference between the "rejection rate" of Daniel's guardrails and the
EER actually yielded by those guardrails persists when more than one
real contrast is present. An experimenter using Daniel's original
version of the half-normal plot should definitely be aware of this
deficiency in the critical values presented in Table 11a of Daniel
(1959).


4. THE MAIN SIMULATION STUDY


Using the critical values from Section 3, we have performed
computer sampling experiments to investigate the detection rate,
error rates, and variance estimation of the half-normal plot and to
compare the three versions described in Section 2.

4.1 Situations Examined

This section defines the notation used to describe a given
"situation" and lists all situations which were examined. A situation
is a specification of the number and sizes of the real contrasts
present in the experiment. We assume that the state of nature and the
experimental design are such that all Model I ANOVA assumptions are
satisfied.

In the simplest situation, the null situation, all μ_i = 0. The
null situation, denoted (0), represents the state of nature when the
null hypothesis that all contrasts are null is true. For a given
situation representing a particular alternative hypothesis, one or more
of the μ_i ≠ 0. Daniel (1959) and Birnbaum (1959) investigated
situations in which one μ_i ranged from 1σ to 6σ. We consider
situations in which from one to six of the μ_i are non-zero. A situation
with as many as six real contrasts might occur, for instance, when an
experimenter encounters a 2^{6-2} fractional factorial experiment in
which four main effects, along with two of the two-factor interactions,
are non-zero.

With more than one real contrast present, the situations easiest
to characterize are those in which contrasts of only one size are present.
We refer to these as Type I situations and consider them in Section
5. We define (r,m) to be a situation in which there are r real
contrasts, each of size m. Thus (4,6σ) implies four non-zero contrasts,
each of size 6σ.

In Section 6 we consider Type II situations, i.e., situations
with real contrasts of two different sizes present. We define
(r_1,m_1;r_2,m_2) to be a situation in which there are r_1 real contrasts,
each with mean m_1, and r_2 real contrasts, each with mean m_2. For
instance, (2,6σ;2,8σ) implies four non-zero contrasts, two of size
6σ and two of size 8σ. The remaining eleven contrasts are null,
i.e., each has mean zero.

We define an r-situation to be a Type I situation with r non-zero
contrasts. Thus, (4,6σ) is a particular 4-situation. Similarly,
we define an r_1-r_2-situation to be a Type II situation in which there
are r_1 contrasts of size m_1 ≠ 0.0 and r_2 contrasts of size m_2 ≠ 0.0.


The Main Study examined Type I situations (r,m), where
r = 1, 2, 4, 6 and m = 0(2σ)8σ. We also examined Type II situations
(r_1,m_1;r_2,m_2), where r_1 = r_2 = 1, 2; m_1 = 2σ, 4σ, 6σ; and m_1 < m_2 =
4σ, 6σ, 8σ. We simulated each situation 1000 times and analyzed
the simulated data using each version with each of three critical
value levels: α = 0.05, 0.20, and 0.40. To discuss the performance
of version S in analyzing experiments with 2 real contrasts of size
6σ present, we will, for brevity, refer to results for S in (2,6σ).

The  pseudo-random  standard  normal  deviates  used  in  our  simulation 
studies were generated by the Harvard Computing Center's RANDOM
function  subroutine  which  is  available  on  the  IBSYS,  Fortran  IV 
system  using  an  IBM  7090/U  computer. 


4.2 The Experimental Design

The Main Study may be viewed as a factorial experiment in which
there are three factors, version, α, and situation, at 3, 3, and 29
levels, respectively. For each situation we decided to examine the
three versions at each α using the same 1000 sets of 15 random
standard normal deviates (rsnd). Hence,
in each situation we introduced a positive correlation between the
results for each version and increased the precision of comparisons among
the various versions in the same situation. This design is analogous
to a split-plot or nested design in which the factor "situation"
is applied to the whole plots (each independent set of 15 rsnd
generated in one simulation of a 2^4 experiment constitutes a whole
plot) and each version at each α is applied to one sub-plot. The
sub-plots are sets of simulated contrasts, each set being identical
to the set of 15 incremented rsnd produced when the 15 rsnd which
constituted the whole plot are modified according to the situation
being simulated.
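The split-plot arrangement amounts to generating one set of 15 deviates per simulation, incrementing it according to the situation, and handing the identical set to every version and α combination. A minimal sketch follows. It assumes σ = 1, assumes the increments are added to the first r deviates (which contrasts are real does not matter for exchangeable deviates), and uses Python's `random` module rather than the generator used in the study. The `analyzers` argument is a hypothetical mapping from a label such as "S at .05" to a function of the 15 contrasts.

```python
import random

def simulate_situation(sizes, analyzers, n_sims=1000, seed=0):
    """Simulate one situation and apply every analyzer to the same data.

    sizes     -- means of the real contrasts in units of sigma,
                 e.g. [6, 6] for (2,6σ) or [] for the null situation (0)
    analyzers -- dict mapping a label to a function of the 15 contrasts
    Returns a dict mapping each label to its list of n_sims results.
    """
    rng = random.Random(seed)
    results = {label: [] for label in analyzers}
    for _ in range(n_sims):
        # Whole plot: 15 independent standard normal deviates.
        rsnd = [rng.gauss(0.0, 1.0) for _ in range(15)]
        # Increment the deviates according to the situation.
        incremented = list(rsnd)
        for index, size in enumerate(sizes):
            incremented[index] += size
        contrasts = tuple(incremented)
        # Sub-plots: every analyzer sees the identical set of contrasts.
        for label, analyze in analyzers.items():
            results[label].append(analyze(contrasts))
    return results
```

Because all analyzers share each simulated set, differences between their results within a situation are not inflated by independent sampling noise.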


4.3 Criteria for Evaluating a Version's Performance

The detection rate D(S) is the average proportion of real
contrasts present in situation S which are detected. We use the statistic
d(S) to estimate D(S), where

d(S) = \sum_{j=1}^{r} j p(j) / r, and

p(j) = (number of simulations in which exactly j of the r real contrasts
were detected)/1000.

We need to consider extensions of this criterion to measure a
version's detection ability in Type II situations. Obviously, two
detection rates, d_1 and d_2, are useful in Type II situations, where

d_i(r_1,m_1;r_2,m_2) = (number of size m_i contrasts detected in
the 1000 simulations of (r_1,m_1;r_2,m_2))/1000 r_i,
i = 1, 2.

Another vital aspect of a version's performance is its false
positive behavior. One criterion here is the experimentwise error
rate (EER). Recall that this is the probability of at least one false
positive per experiment in situation S. It is identical to the
probability error rate of Miller (1966). We use the statistic
f1(S) to estimate the EER, where

f1(S) = \sum_{j=1}^{15-r} q(j), and

q(j) = (number of simulations in which exactly j of the
15-r null contrasts are declared significant)/1000.

The sum runs to 15-r because, if r real contrasts are present, it is
impossible to declare more than 15-r false positives. Restrictions
built into the procedure for a particular half-normal plot version
often set the maximum number of false positives at an even smaller
number.

Another criterion relating to false positive behavior is the
average number of false positives per experiment in situation S,
which we refer to as the error rate per experiment (ERPE), using the
terminology of Hartley (1955). It is identical to Miller's (1966)
expected error rate, if the 15 statements being made about the
significance or insignificance of the 15 contrasts in one 2^4
factorial experiment are viewed as a family, in Miller's terminology.
We use the statistic f2(S) to estimate the ERPE, where

f2(S) = \sum_{j=1}^{15-r} j q(j).
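The three statistics d(S), f1(S), and f2(S) can be computed directly from per-simulation counts. The sketch below assumes the caller has recorded, for each simulation, how many real contrasts were detected and how many null contrasts were declared significant; the function and argument names are illustrative.

```python
from collections import Counter

def performance_measures(detected, false_positives, r):
    """Compute (d, f1, f2) for one situation.

    detected        -- per-simulation count of real contrasts detected
    false_positives -- per-simulation count of null contrasts declared significant
    r               -- number of real contrasts in the situation
    """
    n_sims = len(detected)
    # p(j), q(j): proportion of simulations with exactly j detections / false positives.
    p = {j: count / n_sims for j, count in Counter(detected).items()}
    q = {j: count / n_sims for j, count in Counter(false_positives).items()}
    d = sum(j * p_j for j, p_j in p.items()) / r if r > 0 else float("nan")
    f1 = sum(q_j for j, q_j in q.items() if j >= 1)   # estimates the EER
    f2 = sum(j * q_j for j, q_j in q.items())         # estimates the ERPE
    return d, f1, f2
```

In the null situation r = 0, so the detection rate is undefined and the sketch returns NaN for d.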

For evaluating a version's final estimate of σ in a given situation
the criteria are the obvious ones: s̄_f, the mean of the version's
1000 final estimates of σ in this situation, and s^2(s_f), the variance
of these estimates. The first criterion enables us to estimate the
bias in the estimates of σ and to observe how this bias changes
from situation to situation. The second criterion provides an
estimate of the precision of the entire variance estimation process
and facilitates efficiency comparisons.
This report concentrates on the measures d, f1, and s̄_f. Results
for the other measures and two additional versions not reported here
are given in Zahn (1969) and are available from the author on request.

5. THE EMPIRICAL BEHAVIORS OF VERSIONS X, S, AND R IN
TYPE I SITUATIONS

We discuss the empirical behaviors of versions X, S, and R
in the null situation and in 1-situations for α = .05, .20, .40.
Since differences among the versions and the performance measures are
much the same in 2- and 4-situations as in 1-situations, we consider
only one version, version S, in these situations and concentrate
on the detection rate. In 6-situations we consider version R, the
only one of any use in detecting real contrasts when so many are
present.

5.1 Null Situation Results

Table 5.1 gives f1, s̄_f, and s(s_f) for each version in the null
situation at each of three critical value levels: α = .05, .20, and .40.
Since the values of f1 are estimates of binomial proportions, the
standard deviations of f1 using .05, .20, and .40 level critical
values are approximately √(.05 × .95/1000) = .007, .013, and .015,
respectively. The standard deviation of s̄_f for a particular version
and critical value level is easily calculated by dividing the
appropriate s(s_f) by √1000. Though the differences between several
values of f1 and their corresponding α are too large to attribute to
chance alone, they are not alarming when we recall that there is also
sampling error in the estimated percentiles being used as critical
values. As we expect, σ is underestimated more, on the average, for
larger α as more of the larger null contrasts are declared significant
and removed from the final estimate of σ. Mildly surprising, however,
is the small negative bias (-11.3% to -14.5%) in s̄_f for all versions
using .40 level critical values.
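The quoted standard deviations of f1 follow from the usual binomial formula √(α(1-α)/N) with N = 1000 simulations. As a quick check, under the assumption that the true error rate equals the nominal α:

```python
import math

def binomial_sd(alpha, n_sims=1000):
    """Standard deviation of an estimated proportion whose true value is alpha."""
    return math.sqrt(alpha * (1.0 - alpha) / n_sims)

# Standard deviations of f1 at the three critical value levels.
sd_by_level = {alpha: round(binomial_sd(alpha), 3) for alpha in (0.05, 0.20, 0.40)}
```

Rounded to three decimals these reproduce the values .007, .013, and .015 given in the text.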

A comparison of s(s_f) to the standard deviation, 1/√(2n), of s,
the sample standard deviation, based on n degrees of freedom indicates
that the estimate of σ given by the half-normal plot using α = .05 is
as efficient as an s based on 13.4 "honest" degrees of freedom. By
honest, we mean that the variables included in the construction of
s^2 are all i.i.d. N(0,σ^2). A simulation study not reported here
indicated that if in (0) the half-normal plot always uses all 15
contrasts to estimate σ, i.e., if it uses 0.0 level critical values,
its final estimate of σ is 99% as efficient as s. Thus, the lower
efficiency of the half-normal plot σ estimates using α = .05 is not
due to the fact that σ is estimated by the slope of the line fitted
to the error contrasts, rather than s. Instead, the source of the
inefficiency is that each half-normal plot is allowed to declare
contrasts significant and remove them from the estimate of σ. Hence,
occasionally the half-normal plot's final estimate of σ in (0) will
be based on 14 or fewer null contrasts. This is the price we pay
for having the power to detect real contrasts and remove them from
the final estimate of σ. Using α = .20 and α = .40 we are obviously
more likely to detect real contrasts, but the price we pay in the
null situation is that s_f is now only as efficient as an s based on
10.9 and 9.0 degrees of freedom, respectively.
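The "equivalent degrees of freedom" comparison can be made explicit by inverting the approximation sd(s) ≈ σ/√(2n). The sketch below assumes σ = 1, as in the simulations, and treats the approximation as exact; the function names are illustrative.

```python
import math

def sd_of_s(n, sigma=1.0):
    """Approximate standard deviation of the sample standard deviation s
    based on n degrees of freedom."""
    return sigma / math.sqrt(2.0 * n)

def equivalent_df(sd_of_estimate, sigma=1.0):
    """Degrees of freedom n for which sd_of_s(n) equals the observed
    standard deviation of an estimate of sigma."""
    return sigma ** 2 / (2.0 * sd_of_estimate ** 2)
```

For instance, an observed s(s_f) near 0.193 corresponds to roughly 13.4 equivalent degrees of freedom under this approximation.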
5.2 1-Situation Results


Table 5.2 describes the empirical behavior of the versions
using α = .05, .20, and .40 in 1-situations. We can examine the
precision of the results given in Table 5.2 by noting that d and f1 are
merely estimates of binomial proportions and their variances can be
estimated by the usual formulae. The estimated standard error of
s̄_f is obviously s(s_f)/√1000. Specific values of s(s̄_f) are
presented in Zahn (1969). For the versions, critical value levels,
and situations in Table 5.2 the values of s(s̄_f) range from a minimum
of .006 for all versions using .05 level critical values in situation
(1,1σ) to a maximum of .011 for version R using .05 level critical values
in situation (1,5σ), with 90% of the values being in the range .007
to .010.

The differences among the values of d, the detection rate, in
situations (1,1σ) and (1,2σ) for the three versions are small. Versions
X and S have considerably larger detection rates than R when the size
of the real contrast present is between 3σ and 7σ. Figure 5.1
summarizes the differences in d among the versions for α = .05 and .20.
It further emphasizes the similarity between versions X and S and the
dissimilarity between them and version R.

Though detection rate varies considerably from version to
version, all f1 values are close to their respective α's. The
largest differences between f1 and α occur when the real contrast
is small. Even in (1,2σ) the probability of the real contrast being
x_{15} is only 0.45. Thus, the low detection rate in this situation results
in few opportunities to declare even one false positive.
FIGURE 5.1

Detection Rates for Versions X, S, and R in
1-Situations Using α = .05 and .20
All versions have a tendency to overestimate σ, particularly
when small real contrasts are present. The bias is worst in (1,4σ) and,
for the better versions, is negligible in (1,6σ) and (1,8σ) for
α = .05. A striking aspect of s_f's behavior for all versions is
that s̄_f is not a monotone function of the size of the real contrast.
Rather, s̄_f increases as the size of the real contrast increases
to 3σ or 4σ, depending on the version, and then decreases as the
size increases to 8σ. To explain this, we note that while there are
fewer undetected contrasts in (1,4σ) than in (1,2σ), the bias caused
by an undetected contrast is greater if its size is 4σ than if its
size is 2σ. The additional bias offsets the fact that the detection
rate is greater in (1,4σ) than in (1,2σ).
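
The trade-off just described can be illustrated with a rough calculation. The sketch below is not the study's computation: it uses, as a crude proxy for the bias, the probability that the real contrast is missed multiplied by the squared size it would then add to the error estimate. The detection rates are the 1-situation values quoted in Section 8.1; the function name and the proxy itself are assumptions made for illustration.

```python
# Detection rates quoted in Section 8.1 for 1-situations at the .05 level,
# keyed by the size of the real contrast in units of sigma.
detection_rate = {2: 0.12, 4: 0.62, 6: 0.96, 8: 1.00}

def undetected_inflation(size, d):
    """Crude proxy: P(contrast is missed) times the squared size it contributes."""
    return (1.0 - d) * size ** 2

inflation = {m: undetected_inflation(m, d) for m, d in detection_rate.items()}
worst_size = max(inflation, key=inflation.get)
```

Under this proxy the inflation is largest at 4σ and falls away on either side, which matches the non-monotone pattern described above: the larger contribution of a missed 4σ contrast outweighs the fact that it is missed less often than a 2σ contrast.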

Version R's inferior detection rate affects its final estimate
of σ in two rather obvious ways: (1) Any undetected real contrast
will be included in construction of s_f. Since R detected the fewest
real contrasts, the bias caused by undetected real contrasts will
be more severe for R than for the other versions. (2) The estimates
of σ vary more for version R than they do for the other versions;
thus, s(s_f) is larger for R than for the other versions.


5.3 2- and 4-Situation Results


Since the differences among the versions and criteria are much
the same in 2- and 4-situations as in 1-situations, we concentrate
here on the detection rate, perhaps the criterion of most interest
to the experimenter, and version S. We concentrate on S since it is
generally superior to the other versions, especially in 4-situations
where its detection rate always exceeded the detection rates of
the other versions, often by as much as .07 to .15. Empirical results
for other versions and other criteria in these and other situations
appear in Zahn (1969) and are available from the author on request.

Table 5.3 gives empirical estimates of the detection rate, EER,
and average final estimate of σ for version S in 1-, 2-, and 4-
situations using α = .05, .20, and .40. Table 5.3 also gives the
estimated standard errors of d and s̄_f. To compute the standard error
of d in 2-, 4-, and 6-situations, we first note that results for
individual real contrasts are not independent within trials. Thus,
d behaves as a proportion estimated by cluster sampling and its
standard error can be estimated using the appropriate formulae in
Cochran (1963, p. 64).
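
A minimal sketch of a cluster-sampling standard error is given below, treating each simulated trial as one cluster containing the same number of real contrasts. It uses the common equal-cluster-size form with no finite population correction; whether this is exactly the formula on the cited page of Cochran is an assumption, and the function name and the four-trial data are hypothetical.

```python
import math

def cluster_proportion_se(detected_per_trial, contrasts_per_trial):
    """Pooled detection rate and its standard error, one cluster per trial."""
    n = len(detected_per_trial)
    # Detection rate within each trial (cluster).
    trial_rates = [a / contrasts_per_trial for a in detected_per_trial]
    pooled = sum(trial_rates) / n
    # Between-trial variation drives the standard error, not the binomial formula.
    between = sum((p - pooled) ** 2 for p in trial_rates)
    return pooled, math.sqrt(between / (n * (n - 1)))

# Four hypothetical trials of a 2-situation: real contrasts detected per trial.
d, se = cluster_proportion_se([2, 0, 1, 1], contrasts_per_trial=2)
```

Because detections within a trial tend to succeed or fail together, the between-trial variance is the relevant quantity; the binomial formula would treat the individual contrasts as independent.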

As we expect, for a fixed number of real contrasts present,
the detection rate increases as the size of the real contrasts
increases. However, the detection rate decreases as the number of real
contrasts present increases. Version S has a moderately smaller
detection rate in 2-situations than in 1-situations. It has a much
smaller detection rate in 4-situations than in 1-situations.

Examining Table 5.3 we see that increasing α from .05 to .20
yields a sizable increase in the detection rate of version S.
Increasing α to .40 yields even larger detection rates. The price
we pay for the larger detection rates is, of course, that the
probability of at least one false positive is much larger. However,
another benefit helping to offset this cost is that the bias in
s̄_f decreases sharply as α increases.
TABLE 5.3

Empirical Behavior of Version S in 1-, 2-, and 4-Situations with Real Contrasts of
Sizes 2σ, 4σ, 6σ, and 8σ Present, Using α = .05, .20, and .40.


Number of Real Contrasts | Criterion | Size of the Real Contrasts Present: 2σ | 4σ | 6σ | 8σ


.05 

.080,  .007 

d,8.e.(d)  .20 

.219,  .011 

M 

.365,  .012 

.05 

.023 

n  .20 

.121 

.i*o 

.283 

.05 

1.171,  .008 

s_,s.e.(s,)  .20 

1.078,  .009 

f  f  .Uo 

.98U,  .009 

.533,  .Oil* 

.95**,  .006 

.821,  .010 

,997,  .001 

.920,  .007 

1.000,  .001 

.OUL 

,0L6 

.1 66 

1  .181 

.3Ul 

1  .335 

j 

1.233,  .013 

i  1.019,  .010 

1.016,  .010 

.91*2,  .007 

.921,  .003  ; 

.901,  .007 

' 

.05 

d,s.e.(d) 

.20 

• 

.1*0 

.05 

k 

fl 

.20 

.1*0 

.05 

®j,®.e.(sp 

.20 

.1*0 

976,  .007 
939*  .007 
893,  .007 


997,  .002 
5.4 6-Situation Results

Table 5.4 summarizes the performance of version R using α = .05,
.20, and .40 in 6-situations. Only version R does not break down
in 6-situations. When six real contrasts are present, the test
statistic denominators of versions X and S will either equal or include
the smaller real contrasts. This contamination always occurs and is
severe enough so that neither version detects even 2% of the real
contrasts present. Since the test statistic denominators of version
R are based on at most the seven smallest order statistics, R still
detects some real contrasts in 6-situations. Of course, its detection
rate is smaller in 6-situations than in 4-situations.

The gap between α and the empirical EER is wider in 6-situations
than in 1-, 2-, or 4-situations. In addition, this gap narrows as the
size of the real contrasts present increases.

The bias in s̄_f, which is severe in 6-situations, is smallest for
version R since it has the largest detection rate in these situations.
However, even for version R the bias is large. Furthermore, s_f for R
in 6-situations is exceedingly variable, which is not surprising
since s_f will equal approximately 5.0 if none of the real contrasts
in (6,8σ) are detected and approximately 1.0 if all the real contrasts
in this situation are detected.
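
The two extremes quoted for (6,8σ) can be checked with a back-of-envelope calculation. The sketch below assumes σ = 1 and replaces the plot-based estimate s_f with the root of the expected mean square of the contrasts left in the error estimate; that substitution and the function name are assumptions, so the result should only be expected to land near the quoted figures.

```python
import math

def rms_sigma_estimate(n_real_undetected, size, n_null, sigma=1.0):
    """Root of the expected mean square of the contrasts pooled for error."""
    # An undetected real contrast of mean `size` has E[x^2] = size^2 + sigma^2.
    real_part = n_real_undetected * (size ** 2 + sigma ** 2)
    null_part = n_null * sigma ** 2
    return math.sqrt((real_part + null_part) / (n_real_undetected + n_null))

none_detected = rms_sigma_estimate(6, 8.0, 9)  # six 8-sigma contrasts pooled with nine nulls
all_detected = rms_sigma_estimate(0, 8.0, 9)   # only the nine null contrasts remain
```

The first value is a little above 5 and the second is exactly 1, in line with the range described above and with why a handful of trials in which nothing is detected makes the final estimate so variable.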

In these situations the bias in s̄_f can be greatly reduced and
the detection rate dramatically increased by using larger α. Hence,
we highly recommend the half-normal plot with α = .20 or .40 level
critical values to the experimenter who is doing exploratory
research and might encounter a 4- or a 6-situation.
TABLE 5.4

Behavior of Version R in 6-Situations
Using α = .05, .20, and .40


                                      Situations
Criterion         α     (6,2σ)        (6,4σ)        (6,6σ)        (6,8σ)

d, s.e.(d)       .05    .015, .003    .132, .010    .430, .015    .761, .013
                 .20    .074, .006    .412, .014    .837, .001    .983, .004
                 .40    .189, .009    .697, .013    .976, .004   1.000, .000

EER              .05    .005          .020          .025          .047
                 .20    .041          .129          .190          .177
                 .40    .135          .306          .378          .385

s_f, s.e.(s_f)   .05   1.603, .009   2.454, .021   2.602, .047   1.918, .055
                 .20   1.516, .014   1.889, .029   1.340, .035   1.024, .018
                 .40   1.360, .014   1.355, .026    .979, .016    .913, .008
5.5 On Daniel's Questions about Plot Modifications

The results of our research suggest answers to some questions
raised by Daniel (1959) on possible variants of the half-normal plot.
He wonders if "an invariable rule should be set up, using only some
fixed proportion of the smaller contrasts to estimate error". We
oppose the idea of using a fixed number of contrasts in a single
replicate 2^4 factorial experiment to estimate error because of its
inefficiency when there are only one or two real contrasts.

Another query is whether one should use the error contrasts to
form a mean square error term. Since fitting a line to the error
contrasts yields a highly efficient, quick-and-easy estimate of σ,
we do not feel that it is necessary to form the mean square error
term.

Daniel also questions if one should "decline to use only higher-
order interactions for error since some plot-splitting is almost
inevitable in multi-stage processes". This seems wise if the danger
of hidden plot-splitting is sizable, though this analysis would present
many other complications as well.

We feel that one should "insist on at least partial duplication
of 2^(p-q) experiments when no good previous estimate of error is available"
(Daniel, 1959), especially when the experimenter thinks that as many as
four real contrasts may be present when p-q = 4. Without the partial
duplication in these difficult situations, the error variation estimate
is badly biased when several real contrasts of any size, or a few small-
to-medium-sized real contrasts, are present.
An alternative procedure which has been suggested for use when
as many as six or nine real contrasts are present in a non-
replicated 2^4 factorial experiment is the chain-pooling ANOVA (Holms
and Berrettoni, 1969). In these situations the chain-pooling
procedure might be superior to the half-normal plot versions discussed
in this paper.

5.6 The Half-Normal Plot as an Outlier Rejection Procedure

Suppose we observe 15 random variables which are thought to be i.i.d.
N(0,σ²) in order to estimate σ². However, if some of the observations
are outliers with means μ ≠ 0 we will want to exclude these
observations from the estimation of σ². Now, the location of outliers
under these circumstances poses the same problem as does the detection
of real contrasts in a single replicate 2^4 factorial experiment. Thus,
the half-normal plot can also be used as an outlier rejection procedure.
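
To make the outlier reading concrete, the sketch below computes the coordinates of a half-normal plot for 15 observations: the ordered absolute values paired with expected half-normal quantiles. The plotting position (i - 0.5)/n is one common convention and is an assumption here, not necessarily the one used by Daniel or in this study; the function name and the sample values are hypothetical.

```python
from statistics import NormalDist

def half_normal_plot_points(values):
    """Pair each ordered |value| with an expected half-normal quantile."""
    n = len(values)
    ordered = sorted(abs(v) for v in values)
    std_normal = NormalDist()
    points = []
    for i, magnitude in enumerate(ordered, start=1):
        p = (i - 0.5) / n                              # assumed plotting position
        # Half-normal quantile: P(|Z| <= q) = p  <=>  Phi(q) = 0.5 + 0.5 * p.
        quantile = std_normal.inv_cdf(0.5 + 0.5 * p)
        points.append((quantile, magnitude))
    return points

# Fourteen values of ordinary size and one hypothetical outlier (6.2).
sample = [0.3, -1.1, 0.7, -0.2, 1.4, -0.6, 0.1, 0.9, -1.7, 0.5,
          -0.4, 1.0, -0.8, 6.2, 0.2]
points = half_normal_plot_points(sample)
```

Plotted, the fourteen ordinary values fall near a line through the origin whose slope estimates σ, while the outlier sits far above the quantile expected for the largest of 15 values.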

While doing pilot studies for the Main Study, we examined the power
of the outlier rejection procedure (BCT) proposed by Bliss, Cochran,
and Tukey (1956). The pilot study results demonstrated a serious defect
in this procedure. Although the BCT procedure is reasonably sensitive
to outliers when only two outliers are present, it is almost useless as
an outlier rejection procedure when three or four are present. For
example, when four outliers, each distributed N(6σ,σ²), are present in
the situation described in the previous paragraph, BCT detects
approximately 12% of them. Since all fifteen observations are used in the
denominator of BCT's rejection criterion, outliers will always
contaminate the denominator. The consequences of this contamination are
most serious.
Obviously, BCT is more adversely affected by an increase in the
number of outliers than is the half-normal plot. A suggestion based
on the half-normal plot results is to modify the BCT denominator so
that it does not include the larger observations. For instance, in
the situation described at the beginning of this section, a modified
denominator which might be of interest is the sum of the eleven
observations closest to 0.0. This modification should make the procedure
more robust, though it will sacrifice some efficiency when only one
outlier is present.
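
The suggested modification can be sketched as follows. The BCT criterion itself is not reproduced here; the code only contrasts a quantity built from all fifteen observations with one built from the eleven observations closest to zero. Reading "sum" as a sum of absolute values is an assumption, and the function name and sample are hypothetical.

```python
def sum_closest_to_zero(observations, keep=11):
    """Sum of the `keep` smallest absolute observations."""
    return sum(sorted(abs(x) for x in observations)[:keep])

# Twelve values of ordinary size plus three hypothetical outliers at 6 sigma.
null_part = [-1.2, -0.8, -0.5, -0.3, -0.1, 0.0, 0.2, 0.4, 0.6, 0.9, 1.1, 1.3]
sample = null_part + [6.0, 6.0, 6.0]

full_sum = sum(abs(x) for x in sample)     # every observation contributes
trimmed_sum = sum_closest_to_zero(sample)  # the three outliers never enter
```

In this example the outliers account for most of the full sum, whereas the trimmed sum is built entirely from ordinary observations, so a denominator based on it is not inflated by the outliers it is meant to help detect.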

We feel that there are inadequate warnings in the statistical
literature as to the dire consequences such as the above which may
result from including outliers in the denominators of the test
statistic. Several of the conventional outlier rejection procedures
include all observations in the test statistic denominators. For
instance, when searching for one outlier, Grubbs (1969) recommends

$$T_n = \frac{x_n - \bar{x}}{s},$$

where $x_n$ is the largest observation in the sample,
$\bar{x} = \sum_i x_i / n$, and $s$ is the sample standard deviation
computed from all n observations. We question whether many experimenters
appreciate how drastically the sensitivity of procedures such as $T_n$
may be affected by the inflation of the test statistic denominator which
occurs when two or three outliers are included in it. In general, our
suggestion is to base the test statistic denominator on only the smaller
observations in order to minimize the probability of contaminating the
denominator.
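
A small numerical illustration of this inflation follows. It computes the statistic (largest observation minus sample mean, divided by the sample standard deviation) for the same twelve ordinary values with one and then with three outliers at 6σ. The data are hypothetical and the function name is an assumption; no critical values are used.

```python
from statistics import mean, stdev

def grubbs_statistic(sample):
    """(largest observation - sample mean) / sample standard deviation."""
    return (max(sample) - mean(sample)) / stdev(sample)

null_part = [-1.2, -0.8, -0.5, -0.3, -0.1, 0.0, 0.2, 0.4, 0.6, 0.9, 1.1, 1.3]

t_one = grubbs_statistic(null_part + [6.0])              # a single outlier
t_three = grubbs_statistic(null_part + [6.0, 6.0, 6.0])  # three outliers
```

Here the statistic drops from about 3.0 with one outlier to about 1.9 with three, even though each outlier is just as extreme: the additional outliers pull the mean toward themselves and inflate the standard deviation in the denominator.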




6. THE EMPIRICAL BEHAVIORS OF VERSIONS
X, S, AND R IN TYPE II SITUATIONS

This section discusses the results of using the half-normal plot
to analyze experiments in Type II situations, i.e., situations in
which there are real contrasts of two different sizes. The main
results for Type II situations are summarized in Tables 6.1 and 6.2.

6.1 1-1-Situation Results

In order to isolate the effect of the presence of one size m_2
contrast on the detection rate for one size m_1 contrast in situation
(1,m_1;1,m_2), we have constructed Table 6.1. Consider the section of
this table devoted to version X. The five rows of this section
represent detection rate curves for version X under five different
sets of conditions. The first row gives d(1,m), for m = 2σ, 4σ, 6σ,
and 8σ in 1-situations; the second row gives d_1(1,m_1;1,2σ) for
m_1 = 2σ, 4σ, 6σ, 8σ; etc.

To understand how the detection rate behaves in 1-1-situations,
we examine how d_1(1,4σ;1,m_2) varies as m_2 varies from 0σ to 8σ by
considering the second column of Table 6.1. This shows that a size
4σ contrast is more likely to be detected if it is the only real
contrast present than if another real contrast is present. This is
reasonable since the detection rate of all versions has been observed
to decline as the number of real contrasts present increases. As a
second real contrast of increasing size is introduced, we note that
the detection rate for a size 4σ contrast drops at first from 0.44 to
0.35 for R and then rises to 0.41 as the size of the second contrast
TABLE 6.1

Detection Rate for One Size m_1 Contrast When One Size m_2
Contrast is Also Present in the Experiment

These results are all for versions using 0.05 level critical
values.


Version   m_2 \ m_1     2σ       4σ       6σ       8σ

   R         0σ        0.09¹    0.44¹    0.81¹    0.9?¹
             2σ        0.06²    0.35     0.75     0.95
             4σ        0.07     0.38²    0.76     0.96
             6σ        0.06     0.41     0.81²    0.96
             8σ        0.07     0.41     0.81     0.96²

   X         0σ        0.12¹    0.62¹    0.96¹    1.00¹
             2σ        0.09²    0.51     0.93     1.00
             4σ        0.10*    0.53²    0.93     0.99
             6σ        0.09     0.57     0.95²    1.00
             8σ        0.10     0.61     0.96     1.00²

   S         0σ        0.11¹    0.6?¹    0.96¹    1.00¹
             2σ        0.08²    0.51     0.92     1.00
             4σ        0.10     0.54²    0.94     1.00
             6σ        0.10     0.58     0.95²    1.00
             8σ        0.10     0.6?     0.96     1.00²

¹ These detection rates are the detection rates in the respective
1-situations.

² These detection rates are the detection rates in the respective
2-situations.

* 0.10 = d_1(1,2σ;1,4σ) = Detection rate for the size 2σ contrast in
situation (1,2σ;1,4σ), i.e., 10% of the size 2σ contrasts present
in the 1000 simulations of situation (1,2σ;1,4σ) were detected.

increases to 8σ. The explanation of the rise is that as m_2 increases
in size, the second real contrast is more likely to be detected.
Consequently, the size 4σ contrast is examined for significance more
often and is more likely to be detected. The detection rate for the
size 4σ contrast in the presence of an additional, large real
contrast is, however, less than the detection rate in (1,4σ). The
reason is as follows. In (1,4σ) the real contrast is present with
14 null contrasts, whereas here the size 4σ contrast is present
with one large real contrast, i.e., the second real contrast, and
only 13 null contrasts. Thus, less information to estimate σ is
available than in (1,4σ) and we expect to see a slightly smaller
detection rate for the size 4σ contrast than in (1,4σ), even when the
second real contrast is large. The dips and subsequent rises in
detection rate occur, to within sampling errors, for every version
and every size contrast. Though consistent, these dips and rises are
not large.

Similar  comments  hold  for  the  2-2-situation  results  which  are 
given  in  Table  6.2.  However,  in  these  situations  the  dips  and  rises 
previously  noted  are  large. 

7. NOMINATION AND THE HALF-NORMAL PLOT: SOME COMPARISONS

In analyzing experiments lacking a classical, internal estimate
of error variance, another approach is to decide a priori to combine
the higher order interactions to form an estimate of error variance.
This approach, which we refer to as "nomination", has been widely
used by experimenters doing single replicate factorial experiments or
TABLE 6.2

Detection Rate for the Two Size m_1 Contrasts When Two Size m_2
Contrasts Are Also Present in the Experiment

These results are all for versions using 0.05 level critical
values.


Version   m_2 \ m_1     2σ       4σ       6σ       8σ

   R         0σ        0.06²    0.38²    0.81²    0.96²
             2σ        0.04⁴    0.24     0.63*    0.90
             4σ        0.04     0.25⁴    0.60     0.88
             6σ        0.06     0.31     0.67⁴    0.89
             8σ        0.05     0.32     0.72     0.91⁴

   X         0σ        0.09²    0.53²    0.95²    1.00²
             2σ        0.04⁴    0.29     0.82     0.99
             4σ        0.05     0.21⁴    0.61     0.91
             6σ        0.08     0.40     0.72⁴    0.91
             8σ        0.07     0.45     0.86     0.95⁴

   S         0σ        0.08²    0.54²    0.95²    1.00²
             2σ        0.04⁴    0.33     0.85     1.00
             4σ        0.06     0.27⁴    0.77     0.98
             6σ        0.08     0.49     0.86⁴    0.99
             8σ        0.07     0.54     0.95     1.00⁴

² These detection rates are the detection rates in the respective
2-situations.

⁴ These detection rates are the detection rates in the respective
4-situations.

* 0.63 = d_2(2,2σ;2,6σ) = Detection rate for the size 6σ contrasts in
situation (2,2σ;2,6σ), i.e., 63% of the size 6σ real contrasts
present in the 1000 simulations of situation (2,2σ;2,6σ) were
detected.

their fractions in well-researched areas, such as agriculture,
where there is ample evidence from earlier experiments that the real
effects of such high order interactions are usually negligible.

A nomination procedure which is illustrated by an example in
Davies (1954, p. 274) consists of the following steps:

(1) The experimenter assumes that certain contrasts, the
"nominated" contrasts, are null and constructs s_N²,
an estimate of σ², from them.

(2) Each of the remaining contrasts is tested for significance
by dividing its square by s_N² and comparing the result to
a percentage point of the F-distribution.
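
The two steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure as printed in Davies: the contrasts are taken to be on a common scale with mean zero when null, so the variance estimate is the mean of the squared nominated contrasts; the function name, the fifteen contrast values, and the choice of which five to nominate are hypothetical. The critical value 6.61 is the tabulated upper 5% point of F with 1 and 5 degrees of freedom.

```python
def nomination_test(contrasts, nominated, f_critical):
    """Return indices of non-nominated contrasts declared significant."""
    nominated = set(nominated)
    # Step 1: pool the nominated contrasts, assumed null, into a variance estimate.
    s2 = sum(contrasts[i] ** 2 for i in nominated) / len(nominated)
    # Step 2: compare each remaining squared contrast, scaled by s2, with the F point.
    return [i for i, c in enumerate(contrasts)
            if i not in nominated and c ** 2 / s2 > f_critical]

# Fifteen hypothetical contrasts; the last five are nominated as error.
contrasts = [4.2, -0.3, 0.8, 5.1, -1.1, 0.4, -0.9, 1.2, -0.5, 0.7,
             1.0, -1.0, 0.5, -1.5, 1.0]
nominated = [10, 11, 12, 13, 14]
significant = nomination_test(contrasts, nominated, f_critical=6.61)
```

With these values the pooled variance estimate is 1.1, and only the contrasts at positions 0 and 3 exceed the critical ratio. If a real contrast had been nominated by mistake, it would have inflated the estimate and made every test less sensitive.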

Since the results of the Main Study indicate that, barring
the breakdown of a procedure, increasing the EER and ERPE results in
an increase in detection rate, we shall compare in this section the
half-normal plot and a nomination procedure with similar EER and ERPE.
For the EER of the nomination procedure to be comparable to the EER
of the half-normal plot using .05 level critical values, the 0.5%
percentage point of the appropriate F should be used, while the 1.0%
point should be used if we desire the ERPE's of the two procedures
to be comparable.

Another difficulty arises while attempting to compare nomination
to the half-normal plot: In order to calculate the detection rate of
the nomination procedure from tables of the noncentral F- or
t-distribution, we must make the basic assumption that the experimenter
nominated only null contrasts. This assumption biases the results in
favor of nomination. If it is true, all the real contrasts will be
tested for significance. This assumption precludes any contamination
by real contrasts of either the test statistic denominator used by
nomination or the final estimate of error variance given by nomination.
Furthermore, s_N² will not become increasingly inflated as the number
of real contrasts increases.

We let N(e,F(1,e,p)) denote the variant of nomination in which
e contrasts are nominated and F(1,e,p) is used as the critical value.
In this section we restrict our attention to 2^4 factorial experiments
and two nomination procedures; one nominating five error contrasts,
N(5,F), and the other nominating ten, N(10,F). By interpolation in
Tang's tables (1938) and in the non-central t-tables of Resnikoff and
Lieberman (1957), we can calculate the power of N(5,F) and N(10,F)
in (1,2σ), (1,4σ), (1,6σ) using various F-percentage points as
critical values. These results are given in Table 7.1.
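
The power figures obtained from the tables can be approximated by simulation. The sketch below is a Monte Carlo stand-in for the table interpolation, not the method used in the paper: it assumes σ = 1, one real contrast, and an accurate nomination of e null contrasts. The function name, the number of trials, and the seed are hypothetical choices; 6.61 is the tabulated upper 5% point of F with 1 and 5 degrees of freedom.

```python
import random

def nomination_power(size, n_nominated, f_critical, n_trials=20000, seed=1):
    """Monte Carlo detection rate of N(e, F) for one real contrast, sigma = 1."""
    rng = random.Random(seed)
    detected = 0
    for _ in range(n_trials):
        real = rng.gauss(size, 1.0)  # the real contrast
        # Variance estimate pooled from the nominated (null) contrasts.
        s2 = sum(rng.gauss(0.0, 1.0) ** 2
                 for _ in range(n_nominated)) / n_nominated
        if real ** 2 / s2 > f_critical:
            detected += 1
    return detected / n_trials

power_2 = nomination_power(2.0, 5, 6.61)  # N(5, F(1,5,.05)) in (1,2σ)
power_4 = nomination_power(4.0, 5, 6.61)  # N(5, F(1,5,.05)) in (1,4σ)
```

The simulated rates should fall near the corresponding entries of Table 7.1 (.34 and .86), up to Monte Carlo error. Changing `n_nominated` to 10 and the critical value to the appropriate F(1,10,p) point gives the N(10,F) comparison.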

Although the half-normal plot has been unfavorably compared
to nomination procedures in this section, it gives a very good
account of itself with respect to detection rate when compared to
nomination procedures with similar EER and ERPE. The results of
this section indicate that the half-normal plot has a distinctly
larger detection rate than a nomination procedure using the same
EER if the experimenter's prior information will only allow him to
nominate 3- and 4-factor interactions. However, if he can accurately
nominate ten error contrasts, nomination will have a larger detection
rate than the half-normal plot if four real contrasts are present.
TABLE 7.1

Detection Rate of N(5,F) and N(10,F)
Using Three Critical Values


N(5,F(1,5,·))
(Nominating all 3- and 4-factor interactions)

                          Situation
Critical Value     (1,2σ)    (1,4σ)    (1,6σ)

F(1,5,.05)          .34       .86       .99
F(1,5,.01)          .13       .55       .91
F(1,5,.005)         .08       .39       .79


N(10,F(1,10,·))
(Nominating all 3- and 4-factor interactions and 5 of
2-factor interactions)

                          Situation
Critical Value     (1,2σ)       (1,4σ)    (1,6σ)

F(1,10,.05)         .44          .93      1.00
F(1,10,.01)         .19          .77       .99
F(1,10,.005)        not tabled   .65       .97


8.  CONCLUSIONS  AND  RECOMMENDATIONS 

8.1 The Half-Normal Plot and 2^4 Factorial Experiments

Since the half-normal plot is intended to indicate which
contrasts are real and to estimate σ, we will judge it by these standards.
As regards detection rate, we observe in Section 5 that the half-
normal plot using .05 level critical values has a detection rate
as large as 0.12, 0.62, 0.96, and 1.00 in 1-situations for contrasts
of size 2σ, 4σ, 6σ, and 8σ, respectively; 0.09, 0.54, 0.95, and 1.00
in 2-situations; 0.04, 0.30, 0.86, and 1.00 in 4-situations; and
0.01, 0.13, 0.43, and 0.76 in 6-situations. Here, we are reporting
only the results for the version having the highest detection rate
in each situation. These results lead to our conclusion that the
half-normal plot is a suitable procedure for analyzing 2^4 factorial
experiments, provided that four or fewer real contrasts are present.

The decline in detection rate as the number of real contrasts
increases should be noted. The most drastic decrease in detection
rate occurs as the number of real contrasts increases to six. In
6-situations the only version with any detection rate at all is R;
its detection rate is reported in the previous paragraph. The
other versions have a detection rate of at most .03 in 6-situations.

8.2 A Comparison of the Half-Normal Plot Versions in Various Situations

In situation (0) all versions are quite similar.

In 1-situations versions X and S are similar to each other and
superior to version R in every way: they have larger detection rates
and yield less biased, less variable, final estimates of σ.

In 2-situations version R again has little to recommend it.
However, it does compare to the other versions slightly better in 2-
situations than in 1-situations. The performances of versions X
and S are similar.

In 4-situations version S is the best version in terms of
detection rate and estimation of σ.

In 6-situations there is little to recommend any version.
Version R is the only one which does not collapse. However, its
detection rate is much smaller than it was in 4-situations and its
final estimate of σ is badly biased.

Of the three half-normal plot versions considered we recommend
version S on the basis of its steady performance in 1-, 2-, and 4-
situations. If the experimenter expects more than four real contrasts,
we advise him to consider whether he can afford an EER of .20 or .40.
If he can, we recommend version R with .20 or .40 level critical values.
Before acting on this recommendation, the experimenter would be well
advised to consider whether a second replicate or a larger fractional
replicate might pay for itself by dramatically increasing the detection
rate.


8.3 Nomination versus the Half-Normal Plot

As described in Section 7, nomination has a smaller detection
rate than versions of the half-normal plot with equivalent EER's,
unless the experimenter can accurately nominate ten error contrasts.

The half-normal plot estimates σ more efficiently than N(5,F) when
the real contrasts are large (8σ). However, if the contrasts are
only medium-sized and if the nomination is accurate, N(5,F) is more
efficient than the half-normal plot versions examined in the Main
Study. The procedure N(10,F) is as efficient as the half-normal
plot even when the real contrasts are large.

If equivalent EER's are desired, our recommendation is to use
the half-normal plot unless almost all null contrasts can be
accurately nominated.


ACKNOWLEDGMENTS

Research for this article was conducted while the author was a
graduate student in the Department of Statistics, Harvard University,
and was supported by the Office of Naval Research Contract
N00014-67-A-0298-0017 and the National Science Foundation. During
revisions at Florida State University the author was partially
supported by Biometry Training Grant 5 T01 GM1913 from the National
Institute of General Medical Sciences.


REFERENCES 


Birnbaum, A. (1959). On the analysis of factorial experiments
without replication. Technometrics, 1, 343-357.

Blankenship, J. (1965). Use of order statistics for graphical
estimation. Ph.D. Thesis, Oklahoma State University.

Bliss, C. I., Cochran, W. G., and Tukey, J. W. (1956). A rejection
criterion based on the range. Biometrika, 43, 418-422.

Cochran, W. G. (1963). Sampling Techniques. Second Edition, John
Wiley, New York.

Daniel, C. (1959). Use of half-normal plots in interpreting
factorial two-level experiments. Technometrics, 1, 311-341.

Davies, O. L. (1954) (Editor). Design and Analysis of Industrial
Experiments. Second Edition, Oliver and Boyd, London, and
Hafner Publishing Company, New York.

Ferguson, T. S. (1960). Discussion of the papers of Messrs.
Anscombe and Daniel. Technometrics, 2, 159.

Hartley, H. O. (1955). Some recent developments in analysis of
variance. Comm. Pure Appl. Math., 8, 47-72.

Holms, A. G., and Berrettoni, J. N. (1969). Chain-pooling ANOVA
for two-level factorial replication-free experiments.
Technometrics, 11, 725-746.

Miller, R. G. (1966). Simultaneous Statistical Inference.
McGraw-Hill Book Company, New York.

Resnikoff, G. J., and Lieberman, G. J. (1957). Tables of the
Non-central t-Distribution. Stanford University Press,
Stanford, California.

Tang, P. C. (1938). The power function of the analysis of
variance tests with tables and illustrations of their
use. Statist. Res. Mem., 2, 126-149.

Wilks, S. S. (1962). Mathematical Statistics. John Wiley,
New York.

Zahn, D. A. (1969). An empirical study of the half-normal plot.
Ph.D. Thesis, Harvard University.


